Data Archiving and Optical Source Identification

An integral part of the SDSS is the construction, from the vast data base, of an archive that is secure, robust, portable and above all usable. The sheer size and richness of the SDSS data base, both in the number of objects and in the information about each object, together with the depth and complexity of the scientific questions it can address, necessitate an entirely new approach to data handling. The data base must incorporate tools to access and analyze the data. The project proposes to develop two archives. The Operational Archive (OA) resides at Fermilab and runs the survey: it keeps track of photometric and astrometric calibration, selects spectroscopic targets from the photometric data base, designs the spectroscopic plates and decides on the observing strategy from night to night. The Science Archive (SA) is accessible by scientists for data analysis and will become the tool for public distribution of the SDSS data products. The requirements and construction of these data archives are described in considerable detail in Appendices C and D. The design will also allow linkages between the SDSS data base and the large archives from NASA's 2MASS and WIRE missions, should this prove to be desirable and feasible.

An important aspect of the construction of both the operational and science data bases is the incorporation of catalogues obtained at other wavelengths. These serve two main purposes for the OA. First, they aid the reduction of the photometric data by flagging the locations of troublesome objects such as very bright stars. Second, they allow us, on a non-interference basis with the main survey, to target for spectroscopic observation objects of interest at other wavelengths that do not match our primary selection criteria for spectroscopic targets but are bright enough that we can expect to obtain spectra of adequate signal-to-noise ratio (see section 3.8). Further, the software is able to cut from the photometric data an atlas image of a source even if that source is not detected during the photometric data reduction. Such atlas images could be subjected to further analysis "off line", for example co-addition of the data in all five bands or co-addition of atlas images from parts of the sky covered more than once.
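As an illustration of the kind of off-line analysis meant here, the following minimal sketch co-adds several atlas images of the same source pixel by pixel. It is not the survey software: the function name `coadd`, the representation of an image as a list of rows, and the optional per-image weights are all assumptions made for the example, and the images are assumed to be already registered to a common pixel grid.

```python
def coadd(images, weights=None):
    """Weighted mean of equally sized 2-D images, pixel by pixel.

    `images` is a list of images, each a list of rows of pixel values.
    `weights` is an optional list with one weight per image (hypothetical;
    for instance an inverse-variance weight per band or per scan).
    All images are assumed to be registered to the same pixel grid.
    """
    if not images:
        raise ValueError("need at least one image")
    if weights is None:
        weights = [1.0] * len(images)
    if len(weights) != len(images):
        raise ValueError("need exactly one weight per image")

    rows, cols = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != rows or any(len(row) != cols for row in img):
            raise ValueError("all images must have the same shape")

    total_weight = float(sum(weights))
    if total_weight == 0.0:
        raise ValueError("weights must not sum to zero")

    # Accumulate the weighted sum at each pixel, then normalize.
    return [
        [
            sum(w * img[r][c] for img, w in zip(images, weights)) / total_weight
            for c in range(cols)
        ]
        for r in range(rows)
    ]
```

With unit weights this reduces to a plain average of the cut-outs, whether they come from the five bands or from repeated coverage of the same part of the sky.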

Use of other catalogues in the SA allows the optical identification of sources and multiwavelength studies. As part of the science for which support is requested, this proposal explicitly offers the optical identification of sources observed at other wavelengths from selected large catalogues from NASA missions. We propose to find every optical object detected by the SDSS within the positional error box of a source detected at another wavelength, and to calculate for each optical object the likelihood that it is, in fact, identical with the source. This likelihood is based on two criteria: first, the positional discrepancy, and second, the properties of the optical object and of the source detected at the other wavelength. For example, an extended ROSAT source is more likely to be associated with a cluster of galaxies within its error box than with a nearby red star, whereas for an unresolved 2MASS source the nearby red star is the more likely counterpart. The SDSS will provide shape classification data as well as accurate positions, flux densities and colors, making source identification quite reliable. The software tools we propose to develop for this purpose are also described in Appendix D.
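The two criteria can be combined in many ways. The sketch below shows one simple possibility and is not the tool described in Appendix D. It assumes a circular Gaussian positional error of width `sigma_arcsec`, an error box of radius `box_sigma` times that width, a flat-sky separation formula that is only valid for small offsets away from the RA wrap-around, and a hypothetical table `type_weight` that expresses how plausible each optical morphology is as a counterpart for the kind of source being identified. All names are illustrative.

```python
import math
from dataclasses import dataclass


@dataclass
class OpticalObject:
    name: str
    ra_deg: float
    dec_deg: float
    morphology: str  # e.g. "extended" or "point"; stands in for shape classification


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Approximate separation in arcseconds (flat-sky, small offsets only)."""
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    d_ra = (ra1 - ra2) * math.cos(mean_dec)
    d_dec = dec1 - dec2
    return math.hypot(d_ra, d_dec) * 3600.0


def rank_candidates(src_ra, src_dec, sigma_arcsec, candidates,
                    type_weight, box_sigma=3.0):
    """Return (name, relative likelihood) pairs, most likely first.

    Only candidates inside the error box are scored. Each score is the
    product of an assumed Gaussian positional term and an assumed weight
    for the candidate's morphology; scores are normalized to sum to 1.
    """
    scored = []
    for obj in candidates:
        sep = angular_separation_arcsec(src_ra, src_dec, obj.ra_deg, obj.dec_deg)
        if sep > box_sigma * sigma_arcsec:
            continue  # outside the positional error box
        positional = math.exp(-0.5 * (sep / sigma_arcsec) ** 2)
        scored.append((obj.name, positional * type_weight.get(obj.morphology, 0.0)))

    total = sum(score for _, score in scored)
    if total == 0.0:
        return []
    ranked = [(name, score / total) for name, score in scored]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical example: an extended X-ray source with a 10 arcsec error.
candidates = [
    OpticalObject("cluster_galaxy", 180.0020, 0.0, "extended"),
    OpticalObject("red_star", 180.0005, 0.0, "point"),
    OpticalObject("distant_object", 180.0200, 0.0, "extended"),
]
extended_source_weights = {"extended": 0.9, "point": 0.1}  # illustrative values
ranking = rank_candidates(180.0, 0.0, 10.0, candidates, extended_source_weights)
```

In this made-up case the star lies closer to the source position, yet the galaxy ranks first because the assumed weights favor extended counterparts for an extended source. Swapping the weights, as one might for an unresolved infrared source, reverses the ranking.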